Unsupervised learning of binary vectors: a Gaussian scenario.

Authors

  • M Copelli
  • C Van Den Broeck
Abstract

We study a model of unsupervised learning where the real-valued data vectors are isotropically distributed, except for a single symmetry-breaking binary direction B ∈ {-1,+1}^N, onto which the projections have a Gaussian distribution. We show that a candidate vector J undergoing Gibbs learning in this discrete space approaches the perfect match J=B exponentially. In addition to the second-order "retarded learning" phase transition for unbiased distributions, we show that first-order transitions can also occur. Extending the known result that the center of mass of the Gibbs ensemble has Bayes-optimal performance, we show that taking the sign of the components of this vector (clipping) leads to the vector with optimal performance in the binary space. These upper bounds are shown generally not to be saturated with the technique of transforming the components of a special continuous vector, except in asymptotic limits and in a special linear case. Simulations are presented which are in excellent agreement with the theoretical results.
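
The setting in the abstract can be illustrated with a small brute-force sketch. It is not the authors' calculation or simulation code. It assumes that the projection onto B/sqrt(N) is Gaussian with mean rho and standard deviation s, that all orthogonal directions are standard normal, and that the Gibbs ensemble is the posterior over binary vectors J under a uniform prior. The names sample_data, gibbs_center_of_mass, rho and s are hypothetical, and enumerating all 2^N candidates is only feasible for very small N.

```python
import itertools
import numpy as np

def sample_data(B, P, rho, s, rng):
    """Draw P vectors that are isotropic N(0, I), except along B/sqrt(N),
    where the projection is N(rho, s^2). rho and s are assumed parameters."""
    N = B.size
    u = B / np.sqrt(N)                    # unit vector along the binary direction
    z = rng.standard_normal((P, N))       # isotropic background
    b = rho + s * rng.standard_normal(P)  # desired projections onto u
    # Replace each row's projection onto u with the sampled value b.
    return z + np.outer(b - z @ u, u)

def gibbs_center_of_mass(X, rho, s):
    """Posterior mean of J over all 2^N binary vectors (uniform prior assumed)."""
    P, N = X.shape
    J_all = np.array(list(itertools.product([-1.0, 1.0], repeat=N)))
    b = J_all @ X.T / np.sqrt(N)          # projections, shape (2^N, P)
    # Log-likelihood ratio of N(rho, s^2) to N(0, 1), constants dropped.
    log_w = np.sum(-(b - rho) ** 2 / (2 * s ** 2) + b ** 2 / 2, axis=1)
    w = np.exp(log_w - log_w.max())       # subtract the max for stability
    w /= w.sum()
    return w @ J_all

rng = np.random.default_rng(0)
N, P = 10, 200
B = rng.choice([-1.0, 1.0], size=N)
X = sample_data(B, P, rho=1.0, s=0.5, rng=rng)

com = gibbs_center_of_mass(X, rho=1.0, s=0.5)  # continuous center of mass
J_clip = np.where(com >= 0, 1.0, -1.0)         # clipping: sign of each component
overlap = J_clip @ B / N                       # 1.0 means J_clip equals B
print("overlap of clipped center of mass with B:", overlap)
```

Clipping here is simply the componentwise sign of the center of mass, which is the step the abstract identifies as optimal within the binary space.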

Similar resources

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

Detection of two Gaussian clusters

We discuss the detection of two Gaussian clusters given a cloud of points. The optimal learning curve for this unsupervised learning scenario is determined with a replica calculation. A comparison with principal component analysis and supervised learning allows us to understand the three different learning phases observed.

Bayesian Sparse Unsupervised Learning for Probit Models of Binary Data

We present a unified approach to unsupervised Bayesian learning of factor models for binary data with binary and spike-and-slab latent factors. We introduce a non-negative constraint in the spike-and-slab prior that eliminates the usual sign ambiguity present in factor models and lowers the generalization error on the datasets tested here. For the generative models we use probit functions, whic...

Unsupervised Dialogue Act Induction using Gaussian Mixtures

This paper introduces a new unsupervised approach for dialogue act induction. Given the sequence of dialogue utterances, the task is to assign them the labels representing their function in the dialogue. Utterances are represented as real-valued vectors encoding their meaning. We model the dialogue as Hidden Markov model with emission probabilities estimated by Gaussian mixtures. We use Gibbs s...

Unsupervised Learning of Distributions on Binary Vectors Using Two Layer Networks

We present a distribution model for binary vectors, called the influence combination model, and show how this model can be used as the basis for unsupervised learning algorithms for feature selection. The model is closely related to the Harmonium model defined by Smolensky [RM86, Ch. 6]. In the first part of the paper we analyze properties of this distribution representation scheme. We show t...

Journal:
  • Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics

Volume 61, Issue 6 Pt B

Publication date: 2000